Identifying Word Translations in Non-Parallel Texts
Author
Abstract
Common algorithms for sentence and word alignment allow the automatic identification of word translations from parallel texts. This study suggests that the identification of word translations should also be possible with non-parallel and even unrelated texts. The method proposed is based on the assumption that there is a correlation between the patterns of word co-occurrences in texts of different languages.

1 Introduction

In a number of recent studies it has been shown that word translations can be automatically derived from the statistical distribution of words in bilingual parallel texts (e.g. Catizone, Russell & Warwick, 1989; Brown et al., 1990; Dagan, Church & Gale, 1993; Kay & Röscheisen, 1993). Most of the proposed algorithms first conduct an alignment of sentences, i.e. those pairs of sentences are located that are translations of each other. In a second step a word alignment is performed by analyzing the correspondences of words in each pair of sentences. The results achieved with these algorithms have been found useful for the compilation of dictionaries, for checking the consistency of terminological usage in translations, and for assisting the terminological work of translators and interpreters.

However, despite serious efforts in the compilation of corpora (Church & Mercer, 1993; Armstrong & Thompson, 1995) the availability of a large enough parallel corpus in a specific field and for a given pair of languages will always be the exception, not the rule. Since the acquisition of non-parallel texts is usually much easier, it would be desirable to have a program that can determine the translations of words from comparable or even unrelated texts.
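The abstract rests on patterns of word co-occurrence within a single language. As a rough illustration only, the sketch below counts how often two words appear close together in a monolingual token list. The function name `cooccurrence_counts`, the window size, and the toy sentence are all assumptions made for this example; they are not taken from the paper and do not reproduce its method.

```python
from collections import Counter


def cooccurrence_counts(tokens, window=2):
    """Count unordered word pairs occurring within `window` tokens of each other.

    Hypothetical helper for illustration; the window size is an assumption.
    """
    counts = Counter()
    for i, word in enumerate(tokens):
        # Look only ahead so each pair of positions is counted once.
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                counts[tuple(sorted((word, other)))] += 1
    return counts


# Toy monolingual text (invented for this example).
tokens = "the teacher asked the pupil and the pupil asked the teacher".split()
counts = cooccurrence_counts(tokens)
```

Counts of this kind could be collected separately for each language; how the paper compares them across languages is not specified in this excerpt.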
Similar resources
Equivalence in Technical Texts: The Case of Accounting Terms in English-Persian Dictionaries
Translating accounting documents in general, and accounting terminology in particular, is not a simple task, especially as new terms keep being created in pace with accounting developments. This study was carried out to find the most common and preferable ways to translate accounting terms from English into Persian. Also, an attempt was made to identify the frequently used patterns of word-fo...
Identifying Word Correspondences in Parallel Texts
Researchers in both machine translation (e.g., Brown et al., 1990) and bilingual lexicography (e.g., Klavans and Tzoukermann, 1990) have recently become interested in studying parallel texts (also known as bilingual corpora), bodies of text such as the Canadian Hansards (parliamentary debates) which are available in multiple languages (such as French and English). Much of the current excitement ...
Vocabulary Lists for EAP and Conversation Students
Despite the abundance of research investigating general and academic vocabularies and developing dozens of word lists, few studies have compared academic vocabulary with general service word lists such as conversation vocabulary. Many EAP researchers assume that university students need to know all the words in West’s (1953) General Service List (GSL) as a prerequisite to academic words (e.g., ...
English Vocabulary for Equine Veterans: How Different from GSL and AWL Words
ESP students are usually advised to master general and academic word lists such as West’s (1953) General Service List (GSL) and Coxhead’s (2000) Academic Word List (AWL) to be able to read their academic texts. However, it seems that university students may not need to learn all the words in the two lists, as some words in the lists are less frequent in academic texts. Moreover, there are ...
Journal title:
Volume, issue
Pages -
Publication date: 1995